5.1 Discrete Random Variables: Introduction TCC 
This module serves as the introduction to Discrete Random Variables in the 
Elementary Statistics textbook/collection. 


Student Learning Objectives 
By the end of this chapter, the student should be able to: 


• Recognize and understand discrete probability distribution functions, 
in general. 

• Calculate and interpret expected values. 

• Recognize the binomial probability distribution and apply it 
appropriately. 


Introduction 


A student takes a 10 question true-false quiz. Because the student had such 
a busy schedule, he or she could not study and randomly guesses at each 
answer. What is the probability of the student passing the test with at least a 
70%? 


Small companies might be interested in the number of long distance phone 
calls their employees make during the peak time of the day. Suppose the 
average is 20 calls. What is the probability that the employees make more 
than 20 long distance phone calls during the peak time? 


These two examples illustrate two different types of probability problems 
involving discrete random variables. Recall that discrete data are data that 
you can count. A random variable describes the outcomes of a statistical 
experiment in words. The values of a random variable can vary with 
each repetition of an experiment. 


In this chapter, you will study probability problems involving discrete 
probability distributions. You will also study long-term averages associated 
with them. 


Random Variable Notation 


Upper case letters like X or Y denote a random variable. Lower case letters 
like x or y denote the value of a random variable. If X is a random 
variable, then X is defined in words. 


For example, let X = the number of heads you get when you toss three fair 
coins. The sample space for the toss of three fair coins is 
{TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}. Then, x = 0, 1, 2, 3. X is in 
words and x is a number. Notice that for this example, the x values are 
countable outcomes. Because you can count the possible values that X can 
take on and the outcomes are random (the x values 0, 1, 2, 3), X is a 
discrete random variable. 
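
The three-coin example can be checked by listing every outcome and counting heads. The following is a minimal Python sketch; the names sample_space and heads_counts are illustrative and do not come from the text.

```python
from itertools import product
from collections import Counter

# Enumerate every outcome of tossing three fair coins (H = heads, T = tails).
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# X = the number of heads; tally how many outcomes give each value x.
heads_counts = Counter(outcome.count("H") for outcome in sample_space)

print(sample_space)          # the 8 equally likely outcomes
print(sorted(heads_counts))  # the values x can take on
print(heads_counts)          # how many outcomes produce each x
```

Because the eight outcomes are equally likely, dividing each tally by 8 gives the probability of each x value.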


Optional Collaborative Classroom Activity 


Toss a coin 10 times and record the number of heads. After all members of 
the class have completed the experiment (tossed a coin 10 times and 
counted the number of heads), fill in the chart using a heading like the one 
below. Let X = the number of heads in 10 tosses of the coin. 


x     Frequency of x     Relative Frequency of x 


• Which value(s) of X occurred most frequently? 
• If you tossed the coin 1,000 times, what values would X take on? 
Which value(s) of X do you think would occur most frequently? 
• What does the relative frequency column sum to? 


Glossary 


Random Variable (RV) 
see Variable 


Variable (Random Variable) 
A characteristic of interest in a population being studied. Common 
notation for variables are upper case Latin letters X, Y, Z,...; common 
notation for a specific value from the domain (set of all possible values 
of a variable) are lower case Latin letters x, y, z,.... For example, if X 
is the number of children in a family, then x represents a specific 
integer 0, 1, 2, 3, .... Variables in statistics differ from variables in 
intermediate algebra in the following two ways. 


• The domain of the random variable (RV) is not necessarily a 
numerical set; the domain may be expressed in words; for 
example, if X = hair color then the domain is {black, blond, gray, 
green, orange}. 

• We can tell what specific value x of the Random Variable X takes 
only after performing the experiment. 


5.7 Discrete Random Variables: Homework TCC 
This module provides a number of homework exercises related to Discrete 
Random Variables. 
Exercise: 


Problem: 1. Complete the PDF and answer the questions. 


x     P(X = x)     x · P(X = x) 
0     0.3 
1     0.2 
2 
3     0.4 


a. Find the probability that X = 2. 
b. Find the expected value. 


Solution: 


a. 0.1 
b. 1.6 


Exercise: 


Problem: 


Suppose that you are offered the following “deal.” You roll a die. If 
you roll a 6, you win $10. If you roll a 4 or 5, you win $5. If you roll a 
1, 2, or 3, you pay $6. 


a. What are you ultimately interested in here (the value of the roll 
or the money you win)? 

b. In words, define the Random Variable X. 

c. List the values that X may take on. 

d. Construct a PDF. 

e. Over the long run of playing this game, what are your expected 
average winnings per game? 

f. Based on numerical values, should you take the deal? Explain 
your decision in complete sentences. 


Exercise: 


Problem: 


A venture capitalist, willing to invest $1,000,000, has three 
investments to choose from. The first investment, a software company, 
has a 10% chance of returning $5,000,000 profit, a 30% chance of 
returning $1,000,000 profit, and a 60% chance of losing the million 
dollars. The second company, a hardware company, has a 20% chance 
of returning $3,000,000 profit, a 40% chance of returning $1,000,000 
profit, and a 40% chance of losing the million dollars. The third 
company, a biotech firm, has a 10% chance of returning $6,000,000 
profit, a 70% chance of no profit or loss, and a 20% chance of losing the 
million dollars. 


a. Construct a PDF for each investment. 

b. Find the expected value for each investment. 

c. Which is the safest investment? Why do you think so? 
d. Which is the riskiest investment? Why do you think so? 
e. Which investment has the highest expected return, on average? 


Solution: 


b. $200,000; $600,000; $400,000 
c. third investment 

d. first investment 

e. second investment 
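
The expected values in part b can be reproduced by multiplying each profit by its probability and adding the products. The sketch below is one way to do that in Python; the dictionary layout and the function name expected_value are assumptions made for illustration.

```python
# Each investment is a list of (profit in dollars, probability) pairs,
# copied from the problem statement.
investments = {
    "software": [(5_000_000, 0.10), (1_000_000, 0.30), (-1_000_000, 0.60)],
    "hardware": [(3_000_000, 0.20), (1_000_000, 0.40), (-1_000_000, 0.40)],
    "biotech": [(6_000_000, 0.10), (0, 0.70), (-1_000_000, 0.20)],
}

def expected_value(pdf):
    """Return the sum of x * P(x) over all rows of a PDF."""
    return sum(x * p for x, p in pdf)

for name, pdf in investments.items():
    # Round to the nearest dollar to hide floating-point noise.
    print(name, round(expected_value(pdf)))
```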


Exercise: 


Problem: 


A theater group holds a fund-raiser. It sells 100 raffle tickets for $5 
apiece. Suppose you purchase 4 tickets. The prize is 2 passes to a 
Broadway show, worth a total of $150. 


a. What are you interested in here? 

b. In words, define the Random Variable X. 

c. List the values that X may take on. 

d. Construct a PDF. 

e. If this fund-raiser is repeated often and you always purchase 4 
tickets, what would be your expected average winnings per 
game? 


Exercise: 


Problem: 


Suppose that 20,000 married adults in the United States were randomly 
surveyed as to the number of children they have. The results are 
compiled and are used as theoretical probabilities. Let X = the number 
of children 


x               P(x) 
0               0.10 
1               0.20 
2               0.30 
3 
4               0.10 
5               0.05 
6 (or more)     0.05 


a. Find the probability that a married adult has 3 children. 

b. In words, what does the expected value in this example 
represent? 

c. Find the expected value. 

d. Is it more likely that a married adult will have 2 - 3 children or 
4 - 6 children? How do you know? 


Solution: 


a. 0.2 
c. 2.35 
d. 2 - 3 children 


Exercise: 


Problem: 


Suppose that the PDF for the number of years it takes to earn a 
Bachelor of Science (B.S.) degree is given below. 


x     P(x) 
3     0.05 
4     0.40 
5     0.30 
6     0.15 
7     0.10 


a. In words, define the Random Variable X. 

b. What does it mean that the values 0, 1, and 2 are not included 
for X on the PDF? 

c. On average, how many years do you expect it to take for an 
individual to earn a B.S.? 


For each problem: 


a. In words, define the Random Variable X. 
b. List the values that X may take on. 
c. Give the distribution of X. X ~ 


Then, answer the questions specific to each individual problem. 
Exercise: 
Problem: 


Six different colored dice are rolled. Of interest is the number of dice 
that show a “1.” 


d. On average, how many dice would you expect to show a “1”? 
e. Find the probability that all six dice show a “1.” 
f. Is it more likely that 3 or that 4 dice will show a “1”? Use 
numbers to justify your answer numerically. 


Solution: 


a. X = the number of dice that show a 1 
b. 0, 1, 2, 3, 4, 5, 6 

c. X ~ B(6, 1/6) 

d. 1 

e. 0.00002 

f. 3 dice 


Exercise: 


Problem: 


According to a 2003 publication by Waits and Lewis (source: 

U.S. public two-year colleges offered distance learning courses. 
Suppose you randomly pick 13 U.S. public two-year colleges. We are 
interested in the number that offer distance learning courses. 


d. On average, how many schools would you expect to offer such 
courses? 

e. Find the probability that at most 6 offer such courses. 

f. Is it more likely that 0 or that 13 will offer such courses? Use 
numbers to justify your answer numerically and answer in a 
complete sentence. 


Exercise: 


Problem: 


A school newspaper reporter decides to randomly survey 12 students 
to see if they will attend Tet festivities this year. Based on past years, 
she knows that 18% of students attend Tet festivities. We are interested 
in the number of students who will attend the festivities. 


d. How many of the 12 students do we expect to attend the 
festivities? 

e. Find the probability that at most 4 students will attend. 

f. Find the probability that more than 2 students will attend. 


Solution: 


a. X = the number of students that will attend Tet. 
b. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 

c. X ~ B(12, 0.18) 

d. 2.16 

e. 0.9511 

f. 0.3702 


Exercise: 


Problem: 


Suppose that about 85% of graduating students attend their graduation. 
A group of 22 graduating students is randomly chosen. 


d. How many are expected to attend their graduation? 

e. Find the probability that 17 or 18 attend. 

f. Based on numerical values, would you be surprised if all 22 
attended graduation? Justify your answer numerically. 


Exercise: 


Problem: 


At The Fencing Center, 60% of the fencers use the foil as their main 
weapon. We randomly survey 25 fencers at The Fencing Center. We 
are interested in the numbers that do not use the foil as their main 
weapon. 


d. How many are expected to not use the foil as their main 
weapon? 

e. Find the probability that six do not use the foil as their main 
weapon. 

f. Based on numerical values, would you be surprised if all 25 did 
not use foil as their main weapon? Justify your answer 
numerically. 


Solution: 


a. X = the number of fencers that do not use foil as their main 
weapon 

b. 0, 1, 2, 3, ..., 25 

c. X ~ B(25, 0.40) 

d. 10 

e. 0.0442 

f. Yes 


Exercise: 


Problem: 


Approximately 8% of students at a local high school participate in 
after-school sports all four years of high school. A group of 60 seniors 
is randomly chosen. Of interest is the number that participated in after- 
school sports all four years of high school. 


d. How many seniors are expected to have participated in 
after-school sports all four years of high school? 

e. Based on numerical values, would you be surprised if none of 
the seniors participated in after-school sports all four years of 
high school? Justify your answer numerically. 

f. Based upon numerical values, is it more likely that 4 or that 5 of 
the seniors participated in after-school sports all four years of 
high school? Justify your answer numerically. 


Try these multiple choice problems. 


For the next three problems: The probability that the San Jose Sharks will 
win any given game is 0.3694 based on their 13 year win history of 382 
wins out of 1034 games played (as of a certain date). Their 2005 schedule 
for November contains 12 games. Let X = the number of games won in 
November 2005. 

Exercise: 


Problem: 
The expected number of wins for the month of November 2005 is: 


A. 1.67 

B. 12 

C. 382/1043 

D. 4.43 


Solution: 


D: 4.43 
Exercise: 


Problem: 


What is the probability that the San Jose Sharks win 6 games in 
November? 


A. 0.1476 
B. 0.2336 
C. 0.7664 
D. 0.8903 


Solution: 


A: 0.1476 


Exercise: 


Problem: 


Find the probability that the San Jose Sharks win at least 5 games in 
November. 


A. 0.3694 
B. 0.5266 
C. 0.4734 
D. 0.2305 


Solution: 


C: 0.4734 


5.8 Discrete Random Variables: Mid Material Review TCC 
This module provides a number of homework/review exercises 
summarizing topics related to Discrete Random Variables. 
Exercise: 


Problem: 


A sociologist wants to know the opinions of employed adult women 
about government funding for day care. She obtains a list of 520 
members of a local business and professional women’s club and mails 
a questionnaire to 100 of these women selected at random. 68 
questionnaires are returned. What is the population in this study? 


A. All employed adult women 

B. All the members of a local business and professional women’s 
club 

C. The 100 women who received the questionnaire 

D. All employed women with children 


Solution: 


A 


The next two questions refer to the following: An article from The San Jose 
Mercury News was concerned with the racial mix of the 1500 students at 
Prospect High School in Saratoga, CA. The table summarizes the results. 
(Male and female values are approximate. ) 


Ethnic Group 


Gender     White     Asian     Hispanic     Black     American Indian 
Male       400       168       115          35        16 
Female     440       132       140          40        14 
Exercise: 


Problem: Find the probability that a student is Asian or Male. 


Solution: 


0.5773 
Exercise: 


Problem: 


Find the probability that a student is Black given that the student is 


Female. 


Solution: 


0.0522 
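
Both answers can be recomputed from the counts in the table. The sketch below assumes the table is stored as a nested dictionary named counts, a layout chosen only for illustration. It uses the addition rule, P(Asian or Male) = P(Asian) + P(Male) - P(Asian and Male), and restricts the sample to females for the conditional probability.

```python
# Counts of students by gender and ethnic group, copied from the table.
counts = {
    "Male": {"White": 400, "Asian": 168, "Hispanic": 115,
             "Black": 35, "American Indian": 16},
    "Female": {"White": 440, "Asian": 132, "Hispanic": 140,
               "Black": 40, "American Indian": 14},
}

total = sum(sum(row.values()) for row in counts.values())
male = sum(counts["Male"].values())
female = sum(counts["Female"].values())
asian = counts["Male"]["Asian"] + counts["Female"]["Asian"]

# Addition rule: subtract the Asian males so they are not counted twice.
p_asian_or_male = (asian + male - counts["Male"]["Asian"]) / total

# Conditional probability: only the female students form the sample space.
p_black_given_female = counts["Female"]["Black"] / female

print(round(p_asian_or_male, 4), round(p_black_given_female, 4))
```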
Exercise: 


Problem: 


A sample of pounds lost, in a certain month, by individual members of 
a weight reducing clinic produced the following statistics: 


• Mean = 5 lbs. 
• Median = 4.5 lbs. 
• Mode = 4 lbs. 
• Standard deviation = 3.8 lbs. 
• First quartile = 2 lbs. 
• Third quartile = 8.5 lbs. 


The correct statement is: 


A. One fourth of the members lost exactly 2 pounds. 

B. The middle fifty percent of the members lost from 2 to 8.5 lbs. 
C. Most people lost 3.5 to 4.5 lbs. 

D. All of the choices above are correct. 


Solution: 


B 
Exercise: 
Problem: 


What does it mean when a data set has a standard deviation equal to 
zero? 


A. All values of the data appear with the same frequency. 
B. The mean of the data is also zero. 

C. All of the data have the same value. 

D. There are no data to begin with. 


Solution: 


C 


Exercise: 


Problem: The statement that best describes the illustration below is: 


A. The mean is equal to the median. 
B. There is no first quartile. 
C. The lowest data value is the median. 


D. The median equals 


Solution: 


C 
Exercise: 
Problem: 
A “friend” offers you the following “deal.” For a $10 fee, you may 
pick an envelope from a box containing 100 seemingly identical 
envelopes. However, each envelope contains a coupon for a free gift. 


• 10 of the coupons are for a free gift worth $6. 
• 80 of the coupons are for a free gift worth $8. 
• 6 of the coupons are for a free gift worth $12. 
• 4 of the coupons are for a free gift worth $40. 


Based upon the financial gain or loss over the long run, should you 
play the game? 


A. Yes, I expect to come out ahead in money. 
B. No, I expect to come out behind in money. 
C. It doesn’t matter. I expect to break even. 


Solution: 


B 


The next four questions refer to the following: Recently, a nurse 
commented that when a patient calls the medical advice line claiming to 
have the flu, the chance that he/she truly has the flu (and not just a nasty 
cold) is only about 4%. Of the next 25 patients calling in claiming to have 
the flu, we are interested in how many actually have the flu. 

Exercise: 


Problem: Define the Random Variable and list its possible values. 
Solution: 


X = the number of patients calling in claiming to have the flu, who 
actually have the flu. X = 0, 1, 2, ..., 25 


Exercise: 


Problem: State the distribution of X. 


Solution: 


X ~ B(25, 0.04) 
Exercise: 


Problem: 


Find the probability that at least 4 of the 25 patients actually have the 
flu. 


Solution: 


0.0165 
Exercise: 


Problem: 


On average, for every 25 patients calling in, how many do you expect 
to have the flu? 


Solution: 


1 


The next two questions refer to the following: Different types of writing can 
sometimes be distinguished by the number of letters in the words used. A 
student interested in this fact wants to study the number of letters of words 
used by Tom Clancy in his novels. She opens a Clancy novel at random and 
records the number of letters of the first 250 words on the page. 

Exercise: 


Problem: What kind of data was collected? 
A. qualitative 
B. quantitative - continuous 
C. quantitative - discrete 
Solution: 
C 
Exercise: 


Problem: What is the population under study? 


Solution: 


All words used by Tom Clancy in his novels 


5.9 Discrete Random Variables: Practice 1: Discrete Distributions TCC 
This module provides students an opportunity to practice applying concepts 
related to discrete distributions. This practice exercise asks students to 
calculate several values based on the data provided. 


Student Learning Outcomes 


e The student will analyze the properties of a discrete distribution. 


Given: 


A ballet instructor is interested in knowing what percent of each year's class 
will continue on to the next, so that she can plan what classes to offer. Over 
the years, she has established the following probability distribution. 


• Let X = the number of years a student will study ballet with the 
teacher. 
• Let P(x) = the probability that a student will study ballet x years. 


Organize the Data 


Complete the table below using the data provided. 


x     P(x)     x*P(x) 
1     0.10 
2     0.05 
3     0.10 
4 
5     0.30 
6     0.20 
7     0.10 
Exercise: 


Problem: In words, define the Random Variable X. 


Exercise: 


Problem: 


Exercise: 


Problem: 
Exercise: 


Problem: 
On average, how many years would you expect a child to study ballet 
with this teacher? 

Discussion Question 

Exercise: 


Problem: What does the column "P(x)" sum to and why? 


Exercise: 


Problem: What does the column "x*P(x)" sum to and why? 


5.2 Discrete Random Variables: Probability Distribution Function (PDF) for 
a Discrete Random Variable TCC 

This module introduces the Probability Distribution Function (PDF) and its 
characteristics. 


A discrete probability distribution function has two characteristics: 


• Each probability is between 0 and 1, inclusive. 
• The sum of the probabilities is 1. 


Example: 

A child psychologist is interested in the number of times a newborn baby's 
crying wakes its mother after midnight. For a random sample of 50 
mothers, the following information was obtained. Let X = the number of 
times a newborn wakes its mother after midnight. For this example, x = 0, 
1, 2, 3, 4, 5. 

P(x) = probability that X takes on a value x. 


x     P(x) 

0     P(x=0) = 2/50 
1     P(x=1) = 11/50 
2     P(x=2) = 23/50 
3     P(x=3) = 9/50 
4     P(x=4) = 4/50 
5     P(x=5) = 1/50 

X takes on the values 0, 1, 2, 3, 4, 5. This is a discrete PDF because 


1. Each P(x) is between 0 and 1, inclusive. 
2. The sum of the probabilities is 1, that is, 


Equation: 


\frac{2}{50} + \frac{11}{50} + \frac{23}{50} + \frac{9}{50} + \frac{4}{50} + \frac{1}{50} = 1 


Example: 


Suppose Nancy has classes 3 days a week. She attends classes 3 days a 
week 80% of the time, 2 days 15% of the time, 1 day 4% of the time, and 
no days 1% of the time. Suppose one week is randomly selected. 
Exercise: 


Problem: 

Let X = the number of days Nancy __________. 

Solution: 

Let X = the number of days Nancy attends class per week. 
Exercise: 

Problem: X takes on what values? 

Solution: 


0, 1, 2, and 3 


Exercise: 


Problem: 


Suppose one week is randomly chosen. Construct a probability 
distribution table (called a PDF table) like the one in the previous 
example. The table should have two columns labeled x and P(x). 
What does the P(x) column sum to? 


Solution: 
x P(x) 
0 0.01 
1 0.04 
2 0.15 
3 0.80 
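
The two characteristics of a discrete PDF can be checked mechanically for Nancy's table. This is a small sketch in Python; the variable names are illustrative, and a tiny tolerance is used because decimal probabilities are stored as floating-point numbers.

```python
# Nancy's PDF: x = days attended in a week, P(x) = probability.
pdf = {0: 0.01, 1: 0.04, 2: 0.15, 3: 0.80}

# Characteristic 1: each probability is between 0 and 1, inclusive.
each_in_range = all(0 <= p <= 1 for p in pdf.values())

# Characteristic 2: the probabilities sum to 1 (up to rounding error).
total = sum(pdf.values())
is_valid = each_in_range and abs(total - 1) < 1e-9

print(each_in_range, round(total, 10), is_valid)
```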
Glossary 


Probability Distribution Function (PDF) 
A mathematical description of a discrete random variable (RV), given 
either in the form of an equation (formula), or in the form of a table 
listing all the possible outcomes of an experiment and the probability 
associated with each outcome. 


Example: 

A biased coin with probability 0.7 for a head (in one toss of the coin) is 
tossed 5 times. We are interested in the number of heads (the RV X = the 
number of heads). X is Binomial, so X ~ B(5, 0.7) and 
P(X = x) = \binom{5}{x} (0.7)^x (0.3)^{5-x}, or in the form of the table: 


x     P(X = x) 
0     0.0024 
1     0.0284 
2     0.1323 
3     0.3087 
4     0.3602 
5     0.1681 


5.3 Discrete Random Variables: Mean or Expected Value and Standard 
Deviation TCC 

This module explores the Law of Large Numbers, the phenomenon where 
an experiment performed many times will yield cumulative results closer 
and closer to the theoretical mean over time. 


The expected value is often referred to as the "long-term" average or 
mean. This means that over the long term of doing an experiment over and 
over, you would expect this average. 


The mean of a random variable X is μ. If we do an experiment many times 
(for instance, flip a fair coin, as Karl Pearson did, 24,000 times and let X = 
the number of heads) and record the value of X each time, the average is 
likely to get closer and closer to μ as we keep repeating the experiment. 
This is known as the Law of Large Numbers. 
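
The Law of Large Numbers can be watched in a simulation. The sketch below flips a simulated fair coin 24,000 times and prints the running proportion of heads at a few checkpoints. The seed and the checkpoints are arbitrary choices made for this illustration, and the exact numbers printed depend on them.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable (an arbitrary choice)

heads = 0
for flips in range(1, 24_001):
    # random.random() < 0.5 is True (counted as 1) about half the time.
    heads += random.random() < 0.5
    if flips in (10, 100, 1_000, 24_000):
        print(flips, heads / flips)  # running proportion of heads

proportion = heads / 24_000
```

The early proportions can wander noticeably, while the proportion after all 24,000 flips should sit close to the theoretical value 0.5.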


Note: To find the expected value or long term average, μ, simply multiply 
each value of the random variable by its probability and add the products. 


A Step-by-Step Example 

A men's soccer team plays soccer 0, 1, or 2 days a week. The probability 
that they play 0 days is 0.2, the probability that they play 1 day is 0.5, and 
the probability that they play 2 days is 0.3. Find the long-term average, μ, 
or expected value of the days per week the men's soccer team plays soccer. 


To do the problem, first let the random variable X = the number of days the 
men's soccer team plays soccer per week. X takes on the values 0, 1, 2. 
Construct a PDF table, adding a column xP(x). In this column, you will 
multiply each x value by its probability. 


x     P(x)     xP(x) 


0     0.2      (0)(0.2) = 0 
1     0.5      (1)(0.5) = 0.5 
2     0.3      (2)(0.3) = 0.6 


Expected Value Table: This table is called an expected value table. The table 
helps you calculate the expected value or long-term average. 


Add the last column to find the long term average or expected value: 
(0)(0.2) + (1)(0.5) + (2)(0.3) = 0 + 0.5 + 0.6 = 1.1. 


The expected value is 1.1. The men's soccer team would, on the average, 
expect to play soccer 1.1 days per week. The number 1.1 is the long term 
average or expected value if the men's soccer team plays soccer week after 
week after week. We say μ = 1.1. 
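
The expected value table translates directly into a short computation. The following Python sketch stores the PDF as a dictionary (an illustrative choice) and builds the xP(x) column before adding it up.

```python
# PDF for X = number of days per week the team plays soccer.
pdf = {0: 0.2, 1: 0.5, 2: 0.3}

# The xP(x) column: each value multiplied by its probability.
products = [x * p for x, p in pdf.items()]

# The expected value is the sum of that column.
mu = sum(products)

print(products, round(mu, 10))
```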


Example: 

Find the expected value for the example about the number of times a 
newborn baby's crying wakes its mother after midnight. The expected 
value is the expected number of times a newborn wakes its mother after 
midnight. 


x     P(x)               xP(x) 


0     P(x=0) = 2/50      (0)(2/50) = 0 
1     P(x=1) = 11/50     (1)(11/50) = 11/50 
2     P(x=2) = 23/50     (2)(23/50) = 46/50 
3     P(x=3) = 9/50      (3)(9/50) = 27/50 
4     P(x=4) = 4/50      (4)(4/50) = 16/50 
5     P(x=5) = 1/50      (5)(1/50) = 5/50 


Add the last column to find the expected value. μ = Expected Value = 
105/50 = 2.1 


You expect a newborn to wake its mother after midnight 2.1 times, on the 
average. 

Exercise: 


Problem: 


Go back and calculate the expected value for the number of days 
Nancy attends classes a week. Construct the third column to do so. 


Solution: 


2.74 days a week. 


Example: 

Suppose you play a game of chance in which five numbers are chosen 
from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. A computer randomly selects five numbers 
from 0 to 9 with replacement. You pay $2 to play and could profit 
$100,000 if you match all 5 numbers in order (you get your $2 back plus 
$100,000). Over the long term, what is your expected profit of playing the 
game? 

To do this problem, set up an expected value table for the amount of money 
you can profit. 

Let X = the amount of money you profit. The values of x are not 0, 1, 2, 3, 
4, 5, 6, 7, 8, 9. Since you are interested in your profit (or loss), the values 
of x are 100,000 dollars and -2 dollars. 

To win, you must get all 5 numbers correct, in order. The probability of 
choosing one correct number is 1/10 because there are 10 numbers. You may 
choose a number more than once. The probability of choosing all 5 
numbers correctly and in order is: 

Equation: 


\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = 1 \times 10^{-5} = 0.00001 


Therefore, the probability of winning is 0.00001 and the probability of 
losing is 
Equation: 


1 — 0.00001 = 0.99999 


The expected value table is as follows. 


           x           P(x)        xP(x) 
Loss       -2          0.99999     (-2)(0.99999) = -1.99998 
Profit     100,000     0.00001     (100000)(0.00001) = 1 


Add the last column. -1.99998 + 1 = -0.99998 


Since -0.99998 is about -1, you would, on the average, expect to lose 
approximately one dollar for each game you play. However, each time you 
play, you either lose $2 or profit $100,000. The $1 is the average or 
expected LOSS per game after playing this game over and over. 
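
The same expected profit can be computed with exact fractions, which avoids any rounding in the very small win probability. This sketch uses Python's fractions module; the variable names are illustrative.

```python
from fractions import Fraction

# Probability of matching all 5 digits in order (sampling with replacement).
p_win = Fraction(1, 10) ** 5
p_lose = 1 - p_win

# Rows of the expected value table: (profit x in dollars, P(x)).
table = [(-2, p_lose), (100_000, p_win)]

# Expected profit: add up x * P(x) over the rows.
expected_profit = sum(x * p for x, p in table)

print(p_win, p_lose)
print(expected_profit, float(expected_profit))
```

The result is a negative fraction very close to -1, which matches the loss of about one dollar per game described above.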


Example: 
Suppose you play a game with a biased coin. You play each game by 
tossing the coin once. P(heads) = 2/3 and P(tails) = 1/3. If you toss a 
head, you pay $6. If you toss a tail, you win $10. If you play this game 
many times, will you come out ahead? 
Exercise: 


Problem: Define a random variable X. 


Solution: 


X = amount of profit 


Exercise: 


Problem: Complete the following expected value table. 


          x      P(x)     xP(x) 
WIN       10 
LOSE 


Solution: 


          x      P(x)     xP(x) 
WIN       10     1/3      10/3 
LOSE      -6     2/3      -12/3 


Exercise: 
Problem: What is the expected value, μ? Do you come out ahead? 
Solution: 


Add the last column of the table. The expected value μ = -2/3. You 
lose, on average, about 67 cents each time you play the game so you 
do not come out ahead. 


Like data, probability distributions have standard deviations. You use this 
formula for the calculation. 


\sigma = \sqrt{\sum (x - \mu)^2 \cdot P(x)} 


To calculate the standard deviation (σ) of a probability distribution, find 
each deviation from its expected value, square it, multiply it by its 
probability, add the products, and take the square root. To understand how 
to do the calculation, look at the table for the number of days per week a 
men's soccer team plays soccer. To find the standard deviation, add the 
entries in the column labeled (x - μ)²P(x) and take the square root. 


x     P(x)     xP(x)              (x - μ)²P(x) 

0     0.2      (0)(0.2) = 0       (0 - 1.1)²(0.2) = 0.242 
1     0.5      (1)(0.5) = 0.5     (1 - 1.1)²(0.5) = 0.005 
2     0.3      (2)(0.3) = 0.6     (2 - 1.1)²(0.3) = 0.243 


Add the last column in the table. 0.242 + 0.005 + 0.243 = 0.490. The 
standard deviation is the square root of 0.49. σ = √0.49 = 0.7 
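
The standard deviation calculation follows the same pattern as the expected value. The sketch below, with illustrative variable names, computes μ first, then the (x - μ)²P(x) column, and finally the square root of its sum.

```python
from math import sqrt

# PDF for X = number of days per week the team plays soccer.
pdf = {0: 0.2, 1: 0.5, 2: 0.3}

# Expected value: sum of x * P(x).
mu = sum(x * p for x, p in pdf.items())

# Squared deviations from the mean, each weighted by its probability.
weighted_squares = [(x - mu) ** 2 * p for x, p in pdf.items()]

variance = sum(weighted_squares)
sigma = sqrt(variance)

print(round(mu, 4), round(variance, 4), round(sigma, 4))
```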


Generally for probability distributions, we use a calculator or a computer to 
calculate μ and σ to reduce roundoff error. For some probability 
distributions, there are short-cut formulas that calculate μ and σ. 


Glossary 


Expected Value 
Expected arithmetic average when an experiment is repeated many 
times. (Also called the mean). Notations: E(x), μ. For a discrete 
random variable (RV) with probability distribution function P(x), the 
definition can also be written in the form E(x) = μ = Σ xP(x). 


Mean 
A number that measures the central tendency. A common name for 
mean is 'average.' The term 'mean' is a shortened form of 'arithmetic 
mean.' By definition, the mean for a sample (denoted by x̄) is 
x̄ = (Sum of all values in the sample) / (Number of values in the sample), 
and the mean for a population (denoted by μ) is 
μ = (Sum of all values in the population) / (Number of values in the population). 


5.4 Discrete Random Variables: Binomial TCC 
This module describes the characteristics of a binomial experiment and the 
binomial probability distribution function. 


The characteristics of a binomial experiment are: 


1. There are a fixed number of trials. Think of trials as repetitions of an 
experiment. The letter n denotes the number of trials. 

2. There are only 2 possible outcomes, called "success" and "failure", for 
each trial. The letter p denotes the probability of a success on one trial 
and q denotes the probability of a failure on one trial. p + q = 1. 

3. The n trials are independent and are repeated using identical 
conditions. Because the n trials are independent, the outcome of one 
trial does not help in predicting the outcome of another trial. Another 
way of saying this is that for each individual trial, the probability, p, of 
a success and probability, q, of a failure remain the same. For example, 
randomly guessing at a true-false statistics question has only two 
outcomes. If a success is guessing correctly, then a failure is guessing 
incorrectly. Suppose Joe always guesses correctly on any statistics 
true-false question with probability p = 0.6. Then, q = 0.4. This means 
that for every true-false statistics question Joe answers, his 
probability of success (p = 0.6) and his probability of failure (q = 0.4) 
remain the same. 


The outcomes of a binomial experiment fit a binomial probability 
distribution. The random variable X = the number of successes obtained 
in the n independent trials. 


The mean, μ, and variance, σ², for the binomial probability distribution are 
μ = np and σ² = npq. The standard deviation, σ, is then σ = √(npq). 
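
The short-cut formulas can be compared with the general definitions of the mean and variance of a discrete distribution. The sketch below builds the full binomial PDF for one choice of n and p and checks that the direct sums agree with np and npq. The values n = 20 and p = 0.41 are assumed here purely for illustration.

```python
from math import comb, isclose

n, p = 20, 0.41  # illustrative values (an assumption for this sketch)
q = 1 - p

# Binomial PDF: P(x = k) for k = 0, 1, ..., n.
pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# Mean and variance from the general definitions.
mean_direct = sum(k * pk for k, pk in enumerate(pmf))
var_direct = sum((k - mean_direct) ** 2 * pk for k, pk in enumerate(pmf))

# Compare with the short-cut formulas np and npq.
print(isclose(mean_direct, n * p), isclose(var_direct, n * p * q))
```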


Any experiment that has characteristics 2 and 3 and where n = 1 is called a 
Bernoulli Trial (named after Jacob Bernoulli who, in the late 1600s, 
studied them extensively). A binomial experiment takes place when the 
number of successes is counted in one or more Bernoulli Trials. 


Example: 

At ABC College, the withdrawal rate from an elementary physics course is 
30% for any given term. This implies that, for any given term, 70% of the 
students stay in the class for the entire term. A "success" could be defined 
as an individual who withdrew. The random variable is X = the number of 
students who withdraw from the randomly selected elementary physics 
class. 


Example: 

Suppose you play a game that you can only either win or lose. The 
probability that you win any game is 55% and the probability that you lose 
is 45%. Each game you play is independent. If you play the game 20 times, 
what is the probability that you win 15 of the 20 games? Here, if you 
define X = the number of wins, then X takes on the values 0, 1, 2, 3, ..., 
20. The probability of a success is p = 0.55. The probability of a failure is 
q = 0.45. The number of trials is n = 20. The probability question can be 
stated mathematically as P(x = 15). 


Example: 

A fair coin is flipped 15 times. Each flip is independent. What is the 
probability of getting more than 10 heads? Let X = the number of heads in 
15 flips of the fair coin. X takes on the values 0, 1, 2, 3, ..., 15. Since the 
coin is fair, p = 0.5 and q = 0.5. The number of trials is n = 15. The 
probability question can be stated mathematically as P(x > 10). 
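
The two probability questions above, P(x = 15) for the game and P(x > 10) for the coin, can be evaluated with the binomial formula P(x = k) = C(n, k) p^k q^(n-k). The following is a minimal Python sketch; the helper name binomial_probability is hypothetical and was chosen for this illustration.

```python
from math import comb

def binomial_probability(n, p, k):
    """P(x = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Game example: n = 20 games, p = 0.55, exactly 15 wins.
p_win_15 = binomial_probability(20, 0.55, 15)

# Coin example: n = 15 flips, p = 0.5, more than 10 heads means 11 through 15.
p_more_than_10_heads = sum(binomial_probability(15, 0.5, k) for k in range(11, 16))

print(round(p_win_15, 4), round(p_more_than_10_heads, 4))
```

A "more than" question is answered by adding the probabilities of every qualifying value, which is why the second calculation sums over 11, 12, 13, 14, and 15.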


Example: 

Approximately 70% of statistics students do their homework in time for it 
to be collected and graded. Each student does homework independently. In 
a Statistics class of 50 students, what is the probability that at least 40 will 
do their homework on time? Students are selected randomly. 

Exercise: 


Problem: 


This is a binomial problem because there is only a success or a 
__________, there are a definite number of trials, and the probability 
of a success is 0.70 for each trial. 


Solution: 


failure 
Exercise: 


Problem: 


If we are interested in the number of students who do their homework, 
then how do we define X? 


Solution: 

X =the number of statistics students who do their homework on time 
Exercise: 

Problem: What values does x take on? 


Solution: 
0, 1, 2, 3, ..., 50 
Exercise: 


Problem: What is a "failure", in words? 


Solution: 


Failure is a student who does not do his or her homework on time. 


The probability of a success is p = 0.70. The number of trials is n = 50. 
Exercise: 


Problem: If p + q = 1, then what is q? 
Solution: 


q = 0.30 
Exercise: 
Problem: 


The words "at least" translate as what kind of inequality for the 
probability question P(x ____ 40)? 


Solution: 


greater than or equal to (≥) 


The probability question is P(x ≥ 40). 


Notation for the Binomial: B = Binomial Probability 
Distribution Function 


X ~ B(n,p) 


Read this as "X is a random variable with a binomial distribution." The 
parameters are n and p: n = number of trials, p = probability of a success 
on each trial. 


Example: 

It has been stated that about 41% of adult workers have a high school 
diploma but do not pursue any further education. If 20 adult workers are 
randomly selected, find the probability that at most 12 of them have a high 
school diploma but do not pursue any further education. How many adult 
workers do you expect to have a high school diploma but do not pursue 
any further education? 

Let X = the number of workers who have a high school diploma but do not 
pursue any further education. 

X takes on the values 0, 1, 2, ..., 20 where n = 20 and p = 0.41. 
q = 1 - 0.41 = 0.59. X ~ B(20, 0.41) 

Find P(x ≤ 12). P(x ≤ 12) = 0.9738. (calculator or computer) 

Using the TI-83+ or the TI-84 calculators, the calculations are as follows. 
Go into 2nd DISTR. The syntax for the instructions is as follows: 

To calculate P(x = value): binompdf(n, p, number). If "number" is left out, 
the result is the binomial probability table. 

To calculate P(x ≤ value): binomcdf(n, p, number). If "number" is left 
out, the result is the cumulative binomial probability table. 

For this problem: After you are in 2nd DISTR, arrow down to 
binomcdf. Press ENTER. Enter 20,.41,12). The result is 
P(x ≤ 12) = 0.9738. 


Note: If you want to find P(x = 12), use the pdf (binompdf). If you want 
to find P(x > 12), use 1 - binomcdf(20,.41,12). 
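
For readers without a TI calculator, the same three quantities can be approximated with a short Python sketch. The functions below are illustrative stand-ins that borrow the calculator's names binompdf and binomcdf; they are written here from the binomial formula and are not a calculator or library API.

```python
from math import comb


def binompdf(n: int, p: float, x: int) -> float:
    """P(X = x) for X ~ B(n, p), analogous to the calculator's binompdf."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomcdf(n: int, p: float, x: int) -> float:
    """P(X <= x) for X ~ B(n, p), analogous to the calculator's binomcdf."""
    return sum(binompdf(n, p, k) for k in range(0, x + 1))


at_most_12 = binomcdf(20, 0.41, 12)        # P(x <= 12)
exactly_12 = binompdf(20, 0.41, 12)        # P(x = 12)
more_than_12 = 1 - binomcdf(20, 0.41, 12)  # P(x > 12)

print(round(at_most_12, 4))
print(round(exactly_12, 4))
print(round(more_than_12, 4))
```

The cdf is just a running sum of pdf values, which is why P(x > 12) is obtained by subtracting the cdf from 1.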


The probability that at most 12 workers have a high school diploma but do 
not pursue any further education is 0.9738. 
The graph of X ~ B(20, 0.41) is: 

[Figure: histogram of X ~ B(20, 0.41); the horizontal axis shows 
x = 0, 1, 2, 3, 4, 5, ..., 20.] 

The y-axis contains the probability of x, where X = the number of workers 
who have only a high school diploma. 

The number of adult workers that you expect to have a high school 
diploma but not pursue any further education is the mean, 

μ = np = (20)(0.41) = 8.2. 

The formula for the variance is σ² = npq. The standard deviation is 

σ = √(npq). σ = √((20)(0.41)(0.59)) = 2.20. 
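
The arithmetic for the mean and standard deviation can be checked with the sketch below. It also computes the mean a second way, as the long-term average Σ x · P(x) taken over the whole distribution, to suggest why μ = np is called the expected value. The variable names are illustrative only.

```python
from math import comb, sqrt

n, p = 20, 0.41
q = 1 - p

mean = n * p              # mu = np
variance = n * p * q      # sigma squared = npq
std_dev = sqrt(variance)  # sigma = sqrt(npq)

# The same mean from the definition of expected value: sum of x * P(x).
mean_from_pmf = sum(
    x * comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)
)

print(round(mean, 2), round(variance, 3), round(std_dev, 2))
print(round(mean_from_pmf, 2))
```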


Example: 

The following example illustrates a problem that is not binomial. It 
violates the condition of independence. ABC College has a student 
advisory committee made up of 10 staff members and 6 students. The 
committee wishes to choose a chairperson and a recorder. What is the 
probability that the chairperson and recorder are both students? All names 
of the committee are put into a box and two names are drawn without 
replacement. The first name drawn determines the chairperson and the 
second name the recorder. There are two trials. However, the trials are not 
independent because the outcome of the first trial affects the outcome of 
the second trial. The probability of a student on the first draw is 6/16. 
The probability of a student on the second draw is 5/15 when the first draw 
produces a student. The probability is 6/15 when the first draw produces a 
staff member. The probability of drawing a student's name changes for 
each of the trials and, therefore, violates the condition of independence. 
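
The committee example can be worked with exact fractions, as in the sketch below. It multiplies the first-draw probability by the conditional second-draw probability, then shows for contrast what the answer would be if the draws were independent (that is, made with replacement). The with-replacement comparison is an added illustration, not part of the original example.

```python
from fractions import Fraction

students, staff = 6, 10
total = students + staff  # 16 names in the box

first_student = Fraction(students, total)                         # 6/16
second_student_given_student = Fraction(students - 1, total - 1)  # 5/15
second_student_given_staff = Fraction(students, total - 1)        # 6/15

# Without replacement: chairperson and recorder are both students.
both_students = first_student * second_student_given_student

# Hypothetical comparison: with replacement the trials would be independent.
both_students_if_independent = first_student * first_student

print(both_students, both_students_if_independent)
```

The two results differ, which is the practical consequence of the trials not being independent.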


Glossary 


Bernoulli Trials 
An experiment with the following characteristics: 


• There are only 2 possible outcomes called “success” and “failure” 
for each trial. 

• The probability p of a success is the same for any trial (so the 
probability q = 1 - p of a failure is the same for any trial). 


Binomial Distribution 
A discrete random variable (RV) which arises from Bernoulli trials. 
There are a fixed number, n, of independent trials. “Independent” 
means that the result of any trial (for example, trial 1) does not affect 
the results of the following trials, and all trials are conducted under the 
same conditions. Under these circumstances the binomial RV X is 
defined as the number of successes in n trials. The notation is: X ~ 
B(n, p). The mean is μ = np and the standard deviation is σ = √(npq). 
The probability of exactly x successes in n trials is 


P(X = x) = C(n, x) p^x q^(n - x), where C(n, x) = n! / (x!(n - x)!). 
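
To see where the binomial formula comes from, the sketch below enumerates every success/failure sequence of a few Bernoulli trials, adds up the probabilities of the sequences that share the same number of successes, and compares the totals with C(n, x) p^x q^(n - x). The values n = 4 and p = 0.3 are arbitrary choices for illustration and do not come from the text.

```python
from itertools import product
from math import comb, isclose

n, p = 4, 0.3  # small illustrative values, not from the text
q = 1 - p

# Each outcome is a tuple of 1s (success) and 0s (failure). Independence
# means the probability of a sequence is the product of its trial
# probabilities, which depends only on the number of successes x.
by_count = [0.0] * (n + 1)
for outcome in product((0, 1), repeat=n):
    x = sum(outcome)
    by_count[x] += p**x * q ** (n - x)

# Closed-form binomial probabilities for x = 0, 1, ..., n.
formula = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

matches = all(isclose(a, b) for a, b in zip(by_count, formula))
print(matches)
```

The factor C(n, x) appears because that is how many distinct sequences contain exactly x successes.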


5.6 Discrete Random Variables: Practice 2: Binomial Distribution TCC 
This module provides a practice of Binomial Distribution as a part of 
Collaborative Statistics collection (col10522) by Barbara Illowsky and 
Susan Dean. 


Student Learning Outcomes 


e The student will construct the Binomial Distribution. 


Given 


The Higher Education Research Institute at UCLA collected data from 
203,967 incoming first-time, full-time freshmen from 270 four-year 
colleges and universities in the U.S. 71.3% of those students replied that, 
yes, they believe that same-sex couples should have the right to legal 
marital status. (Source: 
http://heri.ucla.edu/PDFs/pubs/TFS/Norms/Monographs/TheAmericanFreshman2011.pdf) 


Suppose that you randomly pick 8 first-time, full-time freshmen from the 
survey. You are interested in the number that believes that same-sex couples 
should have the right to legal marital status. 


Interpret the Data 
Exercise: 


Problem: In words, define the random variable X. 


Solution: 


X = the number that reply “yes” 


Exercise: 


Problem: X ~ __________ 


Solution: 

B(8, 0.713) 

Exercise: 


Problem: What values does the random variable take on? 


Solution: 
0, 1, 2, 3, 4, 5, 6, 7, 8 


Exercise: 


Problem: Construct the probability distribution function (PDF). 
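
One possible way to build the table is sketched below, assuming the survey setup above gives n = 8 trials and p = 0.713. The dictionary name pdf_table is hypothetical; each entry pairs a value x with P(X = x).

```python
from math import comb

n, p = 8, 0.713  # 8 freshmen, each replying "yes" with probability 0.713

# Map each possible value x to its probability P(X = x).
pdf_table = {
    x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)
}

print(" x  P(x)")
for x, prob in pdf_table.items():
    print(f"{x:>2}  {prob:.4f}")

# A valid probability distribution function must sum to 1.
print(round(sum(pdf_table.values()), 4))
```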


Exercise: 


Problem: 


On average, how many would you expect to answer yes? 


Solution: 


5.7 


Exercise: 


Problem: What is the standard deviation? 


Solution: 


1.28 
Exercise: 


Problem: 


What is the probability that at most 5 of the freshmen reply “yes”? 


Solution: 


0.4151 
Exercise: 


Problem: 
What is the probability that at least 2 of the freshmen reply “yes”? 
Solution: 


0.9990 
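
The two answers above can be reproduced with the sketch below, again assuming X ~ B(8, 0.713). The helper name pmf is illustrative. "At most 5" sums the probabilities for x = 0 through 5, while "at least 2" is easier to reach through the complement, 1 - P(x = 0) - P(x = 1).

```python
from math import comb

n, p = 8, 0.713


def pmf(x: int) -> float:
    """Return P(X = x) for X ~ B(8, 0.713)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# "At most 5" means x <= 5.
p_at_most_5 = sum(pmf(x) for x in range(0, 6))

# "At least 2" means x >= 2, the complement of x = 0 or x = 1.
p_at_least_2 = 1 - (pmf(0) + pmf(1))

print(round(p_at_most_5, 4), round(p_at_least_2, 4))
```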
Exercise: 


Problem: 


Construct a histogram or plot a line graph. Label the horizontal and 
vertical axes with words. Include numerical scaling. 


